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We propose a powerful method based on the Hoshen-Kopelman algorithm for simulating percola- 
tion asynchronously on distributed machines. Our method demands very little of hardware and yet 
we are able to make high precision measurements on very large lattices. We implement our method 
to calculate various cluster size distributions on large lattices of different aspect ratios spanning 
three orders of magnitude for two-dimensional site and bond percolation. We find that the nonuni- 
versal constants in the scaling function for the cluster size distribution apparently satisfy a scaling 
relation, and that the moment ratios for the largest cluster size distribution reveal a characteristic 
aspect ratio at r ~ 9. 

PACS numbers: 02.70.-c, 05.10.Ln, 05.70. Jk 

Although an old problem Q, percolation continues to attract a steady stream of papers HS0. High-quality 
numerical data are required to corroborate the many analytical results, particularly from conformal held theory 
0- 0- In this paper, we describe a method of simulating percolation that runs asynchronously in parallel 
on almost any hardware. In principle, the method relaxes all the standard constraints in numerical simulations of 
percolation, such as CPU power, memory, and network capacity. It is especially suited for calculating cluster size 
distributions, finite size corrections, crossing probabilities, and, by applying the corresponding boundary conditions, 
distributions of wrapping clusters on different topologies, e.g., cylinder, torus, or the Mobius strip. 

The Hoshen-Kopelman algorithm (HKA) [ij] is still the standard technique for identifying clusters in percolation, 
where a cluster is a set of sites connected via nearest neighbour interactions (site percolation) or active bonds (bond 
percolation). Strictly speaking, it is a type of data representation particularly suited for tracking clusters. Recently, 
Newman and Ziff Q have shown how to exploit this data representation to monitor the change in various observables 
as the occupation probability p is increased. The data representation efficiently encodes the connectivity of clusters 
in a large percolation system. In this paper, we show how to exploit this representation for different system sizes (up 
to rs 5 x 10 14 sites) and aspect ratios. The algorithm runs asynchronously in parallel over an almost arbitrarily slow 
network of computers. The network is hierarchically organised, and nodes on lower levels (slave nodes) can be slow 
and heterogeneous. In fact, the system scales like an ideal parallel computer: the overall computing time decreases 
linearly with the number of nodes, especially for large slave lattices (patches) where the overhead due to networking 
and related processing and the CPU-time at higher levels (master nodes) becomes negligible (see Tab. [|J. Other 
memory- efficient methods exist for constructing large clusters, for example Paul, Ziff, and Stanley |l0|. In that paper 
a variant of the Leath algorithm is used together with a data structure to record information about visited sites. 
As a result, memory is made available as and when it is required. In our method we can easily count the number of 
spanning clusters per realisation, apply different boundary conditions, rearrange patches for different aspect ratios, 
and gather statistics at every stage of lattice construction. 

We describe our method in detail for two-dimensional site percolation on a square lattice and present the overall 
cluster size distribution for different aspect ratios, as well as the universal moment ratios for the distribution of the 
order parameter in site and bond percolation. We find two surprising results. First, the nonuniversal amplitudes in 
the scaling function for the cluster number distribution numerically satisfy a scaling relation. Second, the moment 
ratios for the largest cluster size distribution all peak at r w 9, defining a characteristic aspect ratio. 

The basic idea of the method is that many slave nodes independently simulate lattices of equal linear size L in 
parallel using the HKA. These nodes send a special representation of their lattice border to a master node, which 
combines m of these patches to form a superlattice. The advantage of such a decomposition is that the master node 
can build up a very large superlattice while maintaining the large scale histograms. The master node can apply 
different boundary conditions and even reuse the same patches several times by rotating, mirroring, and permuting 
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them. 

The key to our algorithm is the representation of the lattice borders by the slave nodes. This is essentially a form of 
path compression (or Nakanishi label recycling where all border sites are considered active (i.e., possibly changing 
connectivity) and bulk sites are considered inactive. In this representation information about the connectivity and size 
of any cluster connected to the border is summarised entirely within border sites. The spatial information of clusters 
is neither required nor stored. Thus clusters not connected to the border are ignored, although their contribution to 
the cluster size histogram is recorded locally. 
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FIG. 1: (a) The lattice and the list of labels, l[i], as prepared by the HKA. (b) The border configuration suitable for the master 
node, after a clockwise border scan starting in the upper left-hand corner, with the list of labels now being irrelevant. For the 
reader's convenience, labels pointing to sites i in the new border carry a suffix ib- 

The HKA produces a list of labels, to which all active sites refer in order to identify their cluster, see Fig.^a). After 
the realisation of a lattice, a new border representation is prepared by visiting each border site in succession, indexed 
from 1 to 4L — 4, see Fig. db). The first site of a previously unscanned cluster contains the size of the cluster as a 
negative value in the range [—1, — L 2 }. This site is called the root. In the list of labels of the original representation, 
the label of this cluster is changed to indicate the new location of the root site in the border. All other sites in the 
border which belong to the same cluster refer to this site. The slave nodes send the border configuration in this 
representation to the master node. If required, clusters in the bulk have their sizes recorded in a local histogram, i.e. 
at the slave node that produced the lattice. This histogram is stored locally for the duration of the simulation. The 
master node is the only component that requires enough memory to hold the large histogram(s) usually generated in 
large scale simulations, while the slaves only need to store a very small amount of local data. 
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FIG. 2: (a) The configuration of the borders before two clusters merge at the marked labels. The labels in the right patch are 
shifted by 4L — 4 to make them unique, (b) The configuration of the borders after the merging procedure. Labels which have 
changed are shown in white. 

When two patches are combined by the master (gluing) it is possible that two clusters merge at the border. This is 
realised by setting one of the root labels (preferably from the smaller cluster) to point to the other, as shown in Fig. El 
The master's histogram is updated by removing both cluster sizes (4 and 8 in the example) and replacing them by 
their sum. By adding the site-normalised histogram of the slaves (i.e., the number density of s-clusters), n slv (s), to 
the site- normalised histogram on the master node, n mst (s), the total histogram, n(s) is obtained, 



n(s) = n mei (s) + n eW (s) 



(1) 
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This result does not involve any approximation and is independent of the number of realisations. Because the 
superposition Eq. Q can be postponed until postprocessing, the slaves can store these data locally. Moreover, 
because all relevant information is encoded in the patches, when and whence they arrive at the master node is 
arbit rary . Hence the algorithm is asynchronous, in contrast to standard techniques of parallelisation, for example 
Ref. 

The master node can itself be considered a slave node and prepare a border configuration for another master 
node, so that one obtains a tree-like structure of master and slave nodes, where statistics can be obtained on every 
level. We have used this scheme to produce a single lattice of size (22.2 x 10 6 ) 2 sites, and have calculated its cluster 
size distribution. For large L the CPU-time required for networking becomes negligible, as shown in Table [I] The 
complexity of the master gluing algorithm is O (mL log L), while the slaves need 0(L 2 logL) time to produce a patch, 
which is represented in 0(L) memory. Therefore, the optimal number of slaves per master in which the master fully 
utilises its resources, while not blocking any slaves, scales like L. At the same time, the relative networking overhead 
per slave scales like 1/L. Table |U shows the corresponding measurements. 
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Slave nodes per master 


Approximate overhead 


100 


2 


4.8% 


200 


4 


2.9% 


500 


10 


1.4% 


1000 


22 


1.7% 



TABLE I: The optimal number of slaves and relative networking overhead of the slave nodes. The master node used was 
roughly twice as fast as the slave nodes and applied six different boundary conditions on 14 different aspect ratios from each 
set of 900 patches of size L 2 produced by the slaves. 

Debarring correlations introduced by the random number generator, all patches arriving at the master are 
statistically independent. However, it is possible to recycle incoming patches by arranging them in different 
configurations (e.g., boundary conditions or aspect ratios). The results for these different configurations are not 
statistically independent. An upper bound can be calculated for the error introduced by this procedure. Rather than 
recycling all patches q times (e.g., for q = 14 different aspect ratios), one could distribute them evenly among q bins, 
now all statistically independent. The error in the estimator for the mean of an observable in the q bins would be a 
factor y/q larger than that for the complete sample. Therefore, when considering results for q bins while using the 
same complete sample in each bin, the upper bound for the error is Jq~ times the error for the complete set. When 
patches are recycled it is possible to reduce the correlations by randomly rotating, mirroring and permuting them. 
We have done this in all simulations. 

As an application of the algorithm, various cluster size distributions for site and bond percolation for q = 14 
different aspect ratios, r = width/height, between 1 and 900 were calculated. The slaves produced square patches 
of three different sizes, L = 10, 100, 1000, of which m = 900 were glued at the master node to form q superlattices 
with N = 300 2 ,3000 2 ,30000 2 sites. The simulations were performed at critical density p c = 0.59274621 for site 
percolation Q, and p c = 1/2 for bond percolation ^4|. All numerical results are based on at least 10 6 independent 
realisations (i.e., roughly 10 9 realisations at the slave nodes). Free boundary conditions have been applied everywhere. 
The random number generator used was the so-called Mersenne-Twister |l5j . which is highly suitable for parallel 
simulations. 

The site-normalised cluster size distribution n Sj f,(s; r) is the number density of s-clusters for aspect ratio r. Hence- 
forth, subscripts s and b refer to site and bond percolation, respectively. For large cluster sizes near p c , n Sj b(s;r) is 
expected to behave like 

n s ,b(s;r) = a Syb (r)s~ T g(s/s° sb ;r) (2) 

where, in a finite system of effective size L, b — b s ^(r)L D , and Q is the scaling function. The effective size can be 

taken as anything that scales linearly in \N. The universal critical exponents are r and D, while the amplitudes 
Gs,b(f) and b Sj b(r) are nonuniversal, and set by two arbitrary conditions on Q. Figure [3] shows s T n s (s;r) for different 
values of r, using t = 187/91 0. 

Two interesting features emerge. Independently of L, the shape of the distribution changes abruptly at around 
r = 2.25 and the maximum of the rescaled distribution is seemingly constant for larger r. Therefore, there is no 
possible choice of L that can collapse the scaling function for different aspect ratios, and Q explicitly depends on r. 
The inset in Fig.|2|shows n s (s; r) — n max;S (s; r)/N at r = 1, where n max;S ,b(s; r) denotes the distribution of the size of 
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FIG. 4: Universal moment ratios jm;s,b(r) for different aspect ratios and system sizes. 



the largest cluster. It seems that the sudden change in the shape of n Sj b(s;r) is caused by a change in n max;B b (s; r), 
but what happens at this particular value of r remains an open question. 
If we define the moment ratios as 

with (s k ) s 6 (r) = J s k n S: b(s; r) ds, then site and bond percolation should differ by powers of the factor 

a s (r)/a b (r) 



(b s (r)/b b (r)y~ 



(4) 



which is obtained by calculating the moments with the help of Eq. J5J. However, we find numerically that this factor is 
unity, i.e., that the ratio a(r)/6(r) T ~ 1 is the same for site and bond percolation. This ratio is not a universal function, 
because its value depends on the conditions imposed on Q for determining a(r) and b(r). However, numerics suggests 
strongly that, once these conditions are given, this ratio is independent of the lattice type, i.e., Eq. J2J represents a 
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universal moment ratio. Therefore, it is possible to write 

a.,b(r) = K?(r)q(r), (5) 

where q(r) depends only on the choice of the two conditions imposed on Q , but not on the lattice type. As mentioned 
above, Q is necessarily an explicit function of r, so that it can absorb q(r) defined in Eq. (J5J, thereby fixing one of 
the two conditions on Q . The remaining condition determines (together with the choice of L) the remaining free 
parameter. Consequently, we conclude that Eq. J5J can be replaced by 

n Slb (s;r) = b T s ; b \r) S - T g( S /(N D ^ b );r). 

Of course, 6 Sj ;,(r) r_1 cannot be absorbed into Q in the same way as q(r) because it depends on the lattice type. Thus 
all the characteristics of the lattice enter solely through b. For completeness we note that numerically the ratios 
a s (r) / ab(r) and b s (r)/bt,(r) are independent of r, no matter what conditions are imposed on Q. 

The order parameter of percolation is the fraction of sites belonging to the spanning (or largest) cluster. Thus, one 
expects the moment ratios 

7m = - m/2 (6) 

with (s fc ) m ^ x b (r) = J s k n max . s b(s; r) ds to be universal. Figure H| shows the behavior of this ratio for different aspect 
ratios r. A pronounced bump appears at around r = 9. The origin of the bump remains unclear, and can be used to 
define a characteristic aspect ratio. 

In conclusion, the method proposed in this paper permits the use of resources usually considered too slow, small, 
or badly connected. At the same time, it takes advantage of parallelisation by providing a very flexible framework for 
simulating different boundary conditions and aspect ratios. By way of illustration, we have increased Tiggemann's 
former world record |13j | for the largest simulated system by a factor of 30. The new record was set by an undergraduate 
computer cluster (as opposed to a Cray T3E) when idle. The data presented have remarkable numerical accuracy and 
are from systems of unprecedented size. They give rise to a number of urgent questions, namely how to reconsider the 
non-universal amplitudes in Eq. (J2J), and how to account for a characteristic aspect ratio as provided by the moment 
ratios of the largest cluster size distribution. 
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